Acronyms

• Combinatorial logic (CL)
• Commercial off the shelf (COTS)
• Complementary metal-oxide semiconductor (CMOS)
• Device under test (DUT)
• Edge-triggered flip-flops (DFFs)
• Error rate ($\lambda$)
• Error rate per bit ($\lambda_{bit}$)
• Error rate per system ($\lambda_{system}$)
• Field programmable gate array (FPGA)
• Global triple modular redundancy (GTMR)
• Hardware description language (HDL)
• Input-output (I/O)
• Intellectual Property (IP)
• Linear energy transfer (LET)
• Mean fluence to failure (MFTF)
• Mean time to failure (MTTF)
• Number of used bits (#UsedBits)
• Operational frequency ($f_s$)
• Personal Computer (PC)
• Probability of configuration upsets ($P_{configuration}$)
• Probability of functional logic upsets ($P_{functionalLogic}$)
• Probability of single event functional interrupt ($P_{SEFI}$)
• Probability of system failure ($P_{system}$)
• Processor (PC)
• Radiation Effects and Analysis Group (REAG)
• Reliability over time ($R(t)$)
• Reliability over fluence ($R(\Phi)$)
• Single event effect (SEE)
• Single event functional interrupt (SEFI)
• Single event latch-up (SEL)
• Single event transient (SET)
• Single event upset (SEU)
• Single event upset cross-section ($\sigma_{SEU}$)
• Xilinx Virtex 5 field programmable gate array (V5)
• Xilinx Virtex 5 field programmable gate array, radiation hardened (V5QV)

Problem Statement

Conventional methods of applying single event upset (SEU) data to complex systems implemented in field programmable gate array (FPGA) devices need improvement.

The problem boils down to extrapolation and application of SEU data to characterize system performance in radiation environments.

Abstract

• We are investigating the application of classical reliability performance metrics combined with standard SEU analysis data.
• We expect to relate SEU behavior to system performance requirements.
  - Example: The system is required to be 99.999% (5-nines) reliable within a given time window. Will the system's SEU response meet mission requirements?
  - Our proposed methodology will provide better prediction of SEU responses in harsh radiation environments.

Background
FPGA SEU Susceptibility Measured in SEU Cross Section ($\sigma_{SEU}$)

• $\sigma_{SEU}$s (per category) are calculated from SEE test and analysis.
• FPGAs vary and so do their SEU responses.
• Most believe the dominant $\sigma_{SEU}$s are per bit (configuration or functional logic). However, global routes are also significant.

$P(f_s)_{system} \propto P_{configuration} + P(f_s)_{functionalLogic} + P_{SEFI}$

• $P(f_s)_{system}$: design $\sigma_{SEU}$.
• $P_{configuration}$: configuration $\sigma_{SEU}$. These $\sigma_{SEU}$s are measured by bit.
• $P(f_s)_{functionalLogic}$: functional logic $\sigma_{SEU}$ (sequential and combinatorial logic). These $\sigma_{SEU}$s are measured by bit.
• $P_{SEFI}$: SEFI $\sigma_{SEU}$ (global routes).

For functional logic, should $\sigma_{SEU}$s be measured by bit?

Background
Current goal: convert SEU cross-sections ($\sigma_{SEU}$, in cm²/particle) to error rates ($\lambda$) for complex systems.

$\sigma_{SEU} = \#errors / fluence$
$\lambda = \#errors / time$

• Perform SEU accelerated radiation testing across ions with different linear energy transfers (LETs) to calculate $\sigma_{SEU}$ per LET.
• Bottom-up approach (transistor level):
  - Given $\sigma_{SEU}$ (per bit), use an error rate calculator (such as CREME96) to obtain an error rate per bit ($\lambda_{bit}$).
  - Multiply $\lambda_{bit}$ by the number of used memory bits (#UsedBits) in the target design to obtain a system error rate ($\lambda_{system}$).
• Top-down approach (system level):
  - Given $\sigma_{SEU}$ (per system), use an error rate calculator (such as CREME96) to obtain an error rate per system ($\lambda_{system}$).
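
The two definitions and the bottom-up scaling step can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and all numeric inputs are hypothetical, and the environment-dependent step performed by an error rate calculator such as CREME96 is not modeled here (the per-bit rate is simply taken as an input).

```python
def cross_section(num_errors: int, fluence: float) -> float:
    """SEU cross-section (cm^2): observed errors divided by fluence (particles/cm^2)."""
    if fluence <= 0:
        raise ValueError("fluence must be positive")
    return num_errors / fluence


def bottom_up_system_rate(rate_per_bit: float, used_bits: int) -> float:
    """Bottom-up estimate: per-bit error rate multiplied by the number of used bits."""
    return rate_per_bit * used_bits


# Hypothetical test result: 50 errors observed over 1e7 particles/cm^2.
sigma_seu = cross_section(num_errors=50, fluence=1.0e7)

# Hypothetical per-bit rate (errors/bit/day), assumed to come from an
# external error rate calculator, scaled by a hypothetical bit count.
lambda_system = bottom_up_system_rate(rate_per_bit=1.0e-10, used_bits=2_000_000)

print(f"sigma_SEU = {sigma_seu:.2e} cm^2")
print(f"lambda_system (bottom-up) = {lambda_system:.2e} errors/day")
```

The top-down path skips the multiplication: the system-level cross-section is fed to the calculator directly, so no bit count enters the estimate.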

Technical Problems with Current Methods of Error Rate Calculation

• For submission to CREME96, $\sigma_{SEU}$ data (across LET) is fitted to a Weibull curve.
  - The two main parameters for curve fitting are a shape factor and a slope factor.
  - During the curve fitting process, a large amount of error can be introduced.
  - Consequently, it is possible for resultant error rates (for the same design) to vary by decades.
• Because of the error rate calculation process, $\sigma_{SEU}$ data is blended together and it is nearly impossible to home in on the problem spots. This can become important for mitigation insertion.

[Figure: top-down $\sigma_{SEU}$ data versus LET; logarithmic cross-section axis from 1.00E-07 to 1.00E-01.]
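
To illustrate why fit parameters matter so much, the sketch below evaluates a Weibull cross-section curve at a low LET for two different shape values. It assumes the commonly used four-parameter Weibull form (saturation cross-section, onset LET, width, shape); the slide does not spell out the exact parameterization, and every number here is hypothetical.

```python
import math


def weibull_cross_section(let, sigma_sat, onset, width, shape):
    """Four-parameter Weibull cross-section curve (assumed form):
    sigma(LET) = sigma_sat * (1 - exp(-((LET - onset) / width) ** shape))
    Returns 0 at or below the onset LET.
    """
    if let <= onset:
        return 0.0
    return sigma_sat * (1.0 - math.exp(-(((let - onset) / width) ** shape)))


# Hypothetical fits that share saturation, onset, and width but differ in shape.
low_let = 1.0  # MeV*cm^2/mg
fit_a = weibull_cross_section(low_let, sigma_sat=1e-2, onset=0.5, width=20.0, shape=1.0)
fit_b = weibull_cross_section(low_let, sigma_sat=1e-2, onset=0.5, width=20.0, shape=3.0)

print(f"shape=1: {fit_a:.3e} cm^2")
print(f"shape=3: {fit_b:.3e} cm^2")
print(f"ratio:   {fit_a / fit_b:.0f}x")
```

Both curves approach the same saturation value at high LET, yet near the onset they differ by roughly three decades. Because low-LET particles dominate most environments, a difference of that size carries straight through to the calculated error rate.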

Technical Problems with Bottom-Up Analysis Method (1)

• Multiplying each bit within a design by $\lambda_{bit}$ is not an efficient method of system error rate prediction.
  - It works well with memory structures, but complex systems do not operate like memories.
  - If an SEU affects a bit, and the bit is inactive, disabled, or masked, a system malfunction might not occur.
• Using the same multiplication factor across DFFs will produce extreme over-estimates: $\lambda_{system} < \lambda_{bit} \times \#UsedBits$.
• To date, there is no accurate method to predict DFF activity for complex systems.
• Fault injection or simulation will not determine frequency of activity.

Technical Problems with Bottom-Up Analysis Method (2)

• There are a variety of components that are susceptible to SEUs (clocks, resets, combinatorial logic, flip-flops (DFFs), etc.).
  - Various component susceptibilities are not accurately characterized at a per bit level.
  - Design topology makes a significant difference in susceptibility and is not characterized in error rate calculators (e.g., CREME96).

Error rates calculated at the transistor-bit level are estimated at too small a granularity for proper extrapolation to complex systems.

• Classical reliability models have been used as a standard metric for complex system performance.
• The analysis provides a more in-depth interpretation of system behavior over time by using system-level MTTF data for system performance metrics.

$R(t) = e^{-t/MTTF}$ or $R(t) = e^{-\lambda t}$

Failure Rate ($\lambda(t)$) Bathtub Curve
(Weibull Probability Density Function (PDF))

• Infant mortality: error rate decreases with time.
• Useful life: random errors (constant error rate).
• Wear-out life: error rate increases with time.

We will focus on the "Useful Life" of the bathtub curve for this analysis.

[Figure: failure rate (failures/time) versus time, showing the three regions of the bathtub curve.]

Mapping Classical Reliability Models from the Time Domain to the Fluence Domain

• The exponential model that relates reliability to MTTF assumes that during useful lifetime:
  - Failures are random.
  - Error rate is constant (Weibull slope = 1, i.e., exponential).
  - $MTTF = 1/\lambda$, so $R(t) = e^{-t/MTTF}$ or $R(t) = e^{-\lambda t}$.
• For a given LET (across fluence):
  - SEUs are random.
  - $\sigma_{SEU}$ is constant.
  - $MFTF = 1/\sigma_{SEU}$.
• Parallel between time and fluence: $\sigma_{SEU} = \#errors/fluence$ and $\lambda_{system} = \#errors/time$.
• Hence, mapping from the time domain to the fluence domain (per LET) is straightforward:
  - $t \Leftrightarrow \Phi$
  - $MTTF \Leftrightarrow MFTF$
  - $\lambda \Leftrightarrow \sigma_{SEU}$
  - $R(t) = e^{-t/MTTF} \Rightarrow R(\Phi) = e^{-\Phi/MFTF}$
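
The fluence-domain model is small enough to state directly in code. The following is a minimal sketch of $R(\Phi) = e^{-\Phi/MFTF}$ with $MFTF = 1/\sigma_{SEU}$; the function names and the example cross-section are illustrative assumptions, not values from the test campaign.

```python
import math


def mftf_from_cross_section(sigma_seu: float) -> float:
    """Mean fluence to failure (particles/cm^2) for a constant cross-section (cm^2)."""
    if sigma_seu <= 0:
        raise ValueError("cross-section must be positive")
    return 1.0 / sigma_seu


def reliability(fluence: float, mftf: float) -> float:
    """Exponential reliability in the fluence domain: R(phi) = exp(-phi / MFTF)."""
    if mftf <= 0:
        raise ValueError("MFTF must be positive")
    return math.exp(-fluence / mftf)


# Hypothetical system-level cross-section at one LET.
mftf = mftf_from_cross_section(1.0e-6)

print(reliability(0.0, mftf))    # no exposure: reliability is 1.0
print(reliability(mftf, mftf))   # exposure equal to the MFTF: exp(-1)
```

As in the time domain, reliability after an exposure equal to the mean (here, one MFTF of fluence) is $e^{-1}$, about 37%.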


To be presented by Melanie D. Berg at the Single Event Effects (SEE) Symposium and Military and Aerospace Programmable Logic Devices (MAPLD) Workshop, La Jolla, CA, May 22-25, 2017.


Creating Reliability Curves from $\sigma_{SEU}$s

• $\sigma_{SEU}$ data is system level.
• A histogram of environment data is created. Bins are determined by LET values at each $\sigma_{SEU}$ data point.
• For each data point at a given LET, a combination of binned environment data and upper-bound $\sigma_{SEU}$ data is used to determine system reliability performance.
• A piecemeal approach is performed per data point to determine the weakest points of system performance.

[Figure: system-level $\sigma_{SEU}$ versus LET, shown with an environment plot of flux (#/cm²/day) > LET versus LET (MeV·cm²/mg) for 100 mils Al shielding. Environment plot source: M. A. Xapsos, IEEE NSREC Short Course, Ponte Vedra Beach, FL, 2008.]




Example

• Mission requirements:
  - The FPGA shall contain an embedded microprocessor.
  - Decision shall be made to select a Xilinx V5QV (approximately $80,000 per device) or a Xilinx V5 with embedded PowerPC (less than $2,000 per device).
  - FPGA operation shall have reliability of 3-nines (99.9%) within a 10-minute window at Geosynchronous Equatorial Orbit (GEO).
• Proposed methodology:
  - Create a histogram of particle flux versus LET for a 10-minute window of time for your target environment.
  - Calculate MFTF per LET (obtain SEU data).
  - Graph $R(\Phi)$ for a variety of LET values and their associated MFTFs, using $R(\Phi) = e^{-\Phi/MFTF}$.
  - For selected ranges of LETs, use an upper bound of particle flux (number of particles/(cm²·10-minutes)) to determine if the system will meet the mission's reliability requirements.

Flux versus LET Histogram for a 10-minute Window

[Figure: histogram of particle flux versus LET for Geosynchronous Equatorial Orbit (GEO) with 100-mils shielding; logarithmic flux axis from 1.0E-05 to 1.0E+03.]

MFTF versus LET for the Xilinx V5 Embedded PowerPC Core and the Xilinx V5QV MicroBlaze Soft Processor Core

[Figure: MFTF (particles/cm²), logarithmic axis from 1.0E+02 to 1.0E+08, versus LET (0 to 100 MeV·cm²/mg), with $MFTF = 1/\sigma_{SEU}$. Series: "V5QV: MicroBlaze with Cache Enabled" and "V5: PowerPC".]

Note: no system errors were observed for V5QV at LET < 3.6 MeV·cm²/mg. However, configuration bit errors were observed (design dependent).

We are focused on system performance.

Reliability across Fluence at LET=0.07 MeV·cm²/mg and Below

• V5QV: no system errors were observed below LET = 3.6 MeV·cm²/mg. Total fluence > $5.0\times10^{8}$ particles/cm².
• PowerPC:
  - No system errors were observed at LET = 0.07 MeV·cm²/mg with total fluence = $1.0\times10^{8}$ particles/cm².
  - Hence, at 0.07 MeV·cm²/mg, we will assume a bounding MFTF = $1.0\times10^{8}$ particles/cm² (equivalently, an upper bound on $\sigma_{SEU}$).
  - More tests would increase the MFTF for this bin.

[Figure: MFTF (particles/cm²) versus LET; logarithmic axis from 1.0E+02 to 1.0E+08.]

Reliability across Fluence up to LET=0.07 MeV·cm²/mg: Low Bound Analysis

Binned GEO environment data shows approximately 3000 particles/(cm²·10-minutes) in the range of 0.0 MeV·cm²/mg to 0.07 MeV·cm²/mg. We are using the MFTF for 0.07 MeV·cm²/mg to upper bound this bin.

[Figure: reliability (9.998400E-01 to 1.000000E+00) versus fluence (0 to 9000 particles/cm²). Curve: PowerPC, MFTF = $1.0\times10^{8}$, $R(\Phi) = e^{-\Phi/(1.0\times10^{8})}$.]

MFTF = $1.0\times10^{8}$ was used because that was the maximum fluence for tests (no errors observed).

Reliability at 3000 particles/(cm²·10-minutes) > 99.99% for the PowerPC design implementation. The number of "9's" could be increased with more tests.
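
As a quick check of the figure quoted for this bin, the exponential model can be evaluated by hand with the stated fluence (3000 particles/cm² per 10-minute window) and the bounding MFTF ($1.0\times10^{8}$ particles/cm²). The first-order approximation $e^{-x} \approx 1 - x$ is an added simplification that holds for small $x$.

```latex
R(\Phi) = e^{-\Phi/MFTF}
        = e^{-3000/(1.0\times10^{8})}
        = e^{-3.0\times10^{-5}}
        \approx 1 - 3.0\times10^{-5}
        = 0.99997 > 0.9999
```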

Reliability across Fluence up to LET=0.14 MeV·cm²/mg

Binned GEO environment data shows approximately 11 particles/(cm²·10-minutes) in the range of 0.07 MeV·cm²/mg to 0.14 MeV·cm²/mg. We are using the MFTF for 0.1 MeV·cm²/mg to upper bound this bin.

[Figure: reliability (9.999920E-01 to 9.999990E-01) versus fluence (0 to 22.5 particles/cm²) for the PowerPC design. The MFTF value in the legend and the exponent of $R(\Phi)$ are not legible.]

Reliability at 5 particles/(cm²·10-minutes) > 99.999% for the V5QV PowerPC design implementation.

Reliability across Fluence up to LET=1.8 MeV·cm²/mg

Binned GEO environment data shows approximately 9 particles/(cm²·10-minutes) in the range of 0.14 MeV·cm²/mg to 1.8 MeV·cm²/mg. We are using the MFTF for 1.8 MeV·cm²/mg to upper bound this bin.

[Figure: reliability (9.992000E-01 to 9.998000E-01 labeled) versus fluence (0 to 28 particles/cm²). Curve: PowerPC, MFTF = $6.0\times10^{4}$. Annotation: we fall below 99.99% at approximately 5 particles/cm².]

Reliability at 9 particles/(cm²·10-minutes) > 99.9% for the PowerPC design implementation. This is the most susceptible bin for the system.
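
The point at which a curve crosses a given reliability level can also be computed in closed form by inverting the exponential model: $\Phi = -MFTF \cdot \ln(R_{target})$. The sketch below applies this to the PowerPC MFTF quoted for this bin ($6.0\times10^{4}$ particles/cm²); the function name is an illustrative assumption.

```python
import math


def fluence_at_reliability(target: float, mftf: float) -> float:
    """Fluence (particles/cm^2) at which R(phi) = exp(-phi / MFTF) drops to `target`."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target reliability must be in (0, 1]")
    return -mftf * math.log(target)


POWERPC_MFTF = 6.0e4  # particles/cm^2, as quoted for the LET <= 1.8 bin

phi_four_nines = fluence_at_reliability(0.9999, POWERPC_MFTF)
phi_three_nines = fluence_at_reliability(0.999, POWERPC_MFTF)

print(f"99.99% crossing: {phi_four_nines:.1f} particles/cm^2")
print(f"99.9% crossing:  {phi_three_nines:.1f} particles/cm^2")
```

The closed form puts the 99.99% crossing at about 6 particles/cm², in line with the approximate value read from the plot. The 99.9% crossing lies at about 60 particles/cm², well above the 9 particles/(cm²·10-minutes) expected in this bin.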

Reliability across Fluence up to LET=3.6 MeV·cm²/mg

Binned GEO environment data shows approximately 0.23 particles/(cm²·10-minutes) in the range of 1.8 MeV·cm²/mg to 3.6 MeV·cm²/mg.

[Figure: reliability (9.99700E-01 to 1.00000E+00) versus fluence (0 to 10 particles/cm²). Curves: V5QV, MFTF = $3.0\times10^{6}$, $R(\Phi) = e^{-\Phi/(3.0\times10^{6})}$; PowerPC, MFTF = 1.2×10^n, where the exponent is not legible.]

Within this LET range, reliability at 0.23 particles/(cm²·10-minutes) > 99.999% for both design implementations.

Reliability across Fluence at LET=40 MeV·cm²/mg

Binned GEO environment data shows approximately 0.07 particles/(cm²·10-minutes) in the range of 3.6 MeV·cm²/mg to 40.0 MeV·cm²/mg.

[Figure: reliability (0.9994 to 0.9999 labeled) versus fluence (0 to 0.1 particles/cm²). Curves: V5QV, MFTF = 7.0×10^n, where the exponent is not legible; PowerPC, MFTF = $2.8\times10^{2}$, $R(\Phi) = e^{-\Phi/(2.8\times10^{2})}$. Annotation: we fall below 99.99% at approximately 0.02 particles/cm².]

Within this LET range, reliability at 0.07 particles/(cm²·10-minutes) > 99.9% for both design implementations. We can refine by analyzing smaller bins.
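
The per-bin check used throughout this example can be collected into one short script. The sketch below covers only the PowerPC bins whose fluence and MFTF values are both legible on the slides; the bin list layout and function names are illustrative assumptions, and the two remaining bins are omitted rather than guessed.

```python
import math

# (LET range in MeV*cm^2/mg, fluence per 10-minute window in particles/cm^2,
#  PowerPC MFTF in particles/cm^2), as quoted on the slides.
POWERPC_BINS = [
    ((0.0, 0.07), 3000.0, 1.0e8),
    ((0.14, 1.8), 9.0, 6.0e4),
    ((3.6, 40.0), 0.07, 2.8e2),
]

REQUIRED_RELIABILITY = 0.999  # 3-nines within a 10-minute window


def bin_reliability(fluence: float, mftf: float) -> float:
    """R(phi) = exp(-phi / MFTF) for one LET bin."""
    return math.exp(-fluence / mftf)


def check_bins(bins, required):
    """Return (LET range, reliability, meets requirement) for each bin."""
    results = []
    for let_range, fluence, mftf in bins:
        r = bin_reliability(fluence, mftf)
        results.append((let_range, r, r >= required))
    return results


results = check_bins(POWERPC_BINS, REQUIRED_RELIABILITY)
for let_range, r, ok in results:
    print(f"LET {let_range[0]}-{let_range[1]}: R = {r:.6f}, meets 3-nines: {ok}")
```

Each bin is evaluated on its own, mirroring the piecemeal approach: the output shows per-bin margins against the requirement rather than a single blended number.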

Example Conclusion

Using the proposed methodology, the commercial Xilinx V5 device will meet project requirements.

In this case, the project is able to save money by selecting the significantly cheaper FPGA device and gain performance because of the embedded PowerPC.

Conclusions

• This study transforms proven classical reliability models into the SEU particle fluence domain. The intent is to better characterize SEU responses for complex systems.
• The method for reliability-model application is as follows:
  - SEU data are obtained as MFTF.
  - Reliability curves (in the fluence domain) are calculated using MFTF and are analyzed with a piecemeal approach.
  - Environment data are then used to determine particle flux exposure within required windows of mission operation.
• The proposed method does not rely on data-fitting and hence removes a significant source of error.
• The proposed method provides information for highly SEU-susceptible scenarios, hence enabling a better choice of mitigation strategy.
• This is preliminary work. There is more to come.

This methodology expresses SEU behavior and response in terms that missions understand via classical reliability metrics.